1 |
A simple language-agnostic yet very strong baseline system for hate speech and offensive content identification ...
|
|
|
|
BASE
|
|
|
|
2 |
Using Fisher's Exact Test to Evaluate Association Measures for N-grams ...
|
|
|
|
BASE
|
|
|
|
3 |
LAST at CMCL 2021 Shared Task: Predicting Gaze Data During Reading with a Gradient Boosting Decision Tree Approach ...
|
|
|
|
BASE
|
|
|
|
4 |
LAST at SemEval-2021 Task 1: Improving Multi-Word Complexity Prediction Using Bigram Association Measures ...
|
|
|
|
BASE
|
|
|
|
7 |
Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora
|
|
|
|
In: Bestgen, Yves. Getting rid of the Chi-square and Log-likelihood tests for analysing vocabulary differences between corpora. In: Quaderns de filología. Estudis lingüístics 22 (2017): 33-56
|
|
BASE
|
|
|
|
8 |
Using n-grams to map registers across languages and uncover cross-linguistic contrasts: Insights from Correspondence Analysis
|
|
|
|
In: CBL (Cercle Belge de Linguistique) 2016, May 2016, Louvain-la-Neuve, Belgium (2016) ; https://hal.archives-ouvertes.fr/hal-01426811
|
|
BASE
|
|
|
|
9 |
Vers une analyse des différences interlinguistiques entre les genres textuels : étude de cas basée sur les n-grammes et l'analyse factorielle des correspondances
|
|
|
|
In: TALN 2016: Traitement Automatique des Langues Naturelles, Jul 2016, Paris, France (2016) ; https://hal.archives-ouvertes.fr/hal-01426820
|
|
Abstract:
The aim of the present study is to assess the use of n-grams and Correspondence Analysis (CA) to compare genres in cross-linguistic studies. The study is based on an English-French bilingual corpus made up of original (i.e. non-translated) texts representing three genres: European parliamentary debates, newspaper editorials and academic articles. First, 2- to 4-grams are extracted in each language. Second, the 1,000 most frequent n-grams for each n-gram length and in each language are analyzed by means of CA to determine which n-grams are particularly salient in the genres examined. Finally, the n-grams are manually classified into a range of categories, such as stance expressions, discourse markers and referential expressions. The results show that the n-gram approach makes it possible to uncover typical features of the three genres investigated, as well as interesting contrasts between English and French.
|
|
Keywords:
[INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; [SHS.LANGUE] Humanities and Social Sciences/Linguistics; Comparable Corpora; Correspondence Analysis; Genres; N-grams
|
|
URL: https://hal.archives-ouvertes.fr/hal-01426820/file/lefer-TALN2016short.pdf https://hal.archives-ouvertes.fr/hal-01426820/document https://hal.archives-ouvertes.fr/hal-01426820
|
|
BASE
|
|
|
|
10 |
Exact Expected Average Precision of the Random Baseline for System Evaluation
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 103, Iss 1, Pp 131-138 (2015)
|
|
BASE
|
|
|
|
14 |
Construction automatique de ressources lexicales pour la fouille d'opinion. ...
|
|
|
|
BASE
|
|
|
|
20 |
How to determine the meaning and use of (causal) connectives in (large) corpora : from hand-based to automatic analyses
|
|
|
|
In: Electronic Document Week (SDN 2004), Workshop ATALA "Modelling and describing discourse organisation in the age of the digital document", La Rochelle, Jun 2004 (2004) ; https://archivesic.ccsd.cnrs.fr/sic_00001224
|
|
BASE
|
|
|
|
|
|